YASUO et al.
TextTiling: segmenting text into multi-paragraph subtopic passages
Marti A. Hearst 

March 1997 Computational Linguistics, Volume 23 issue 1

Full text available: pdf(2.46 MB) Additional Information: full citation, abstract, references, citings
Publisher Site 

TextTiling is a technique for subdividing texts into multi-paragraph units that represent 
passages, or subtopics. The discourse cues for identifying major subtopic shifts are patterns 
of lexical co-occurrence and distribution. The algorithm is fully implemented and is shown to 
produce segmentation that corresponds well to human judgments of the subtopic 
boundaries of 12 texts. Multi-paragraph subtopic segmentation should be useful for many 
text analysis tasks, including information retrieval and ... 

Document Formatting Systems: Survey, Concepts, and Issues
Richard Furuta, Jeffrey Scofield, Alan Shaw 

September 1982 ACM Computing Surveys (CSUR), volume 14 issue 3 

Full text available: pdf(5.36 MB) Additional Information: full citation, references, citings, index terms



CHECK: a document plagiarism detection system 
Antonio Si, Hong Va Leong, Rynson W. H. Lau 

April 1997 Proceedings of the 1997 ACM symposium on Applied computing 

Full text available: pdf(807.83 KB) Additional Information: full citation, references, citings, index terms



Keywords: copy detection, digital libraries, document plagiarism, information retrieval 



ViSWeb — the Visual Semantic Web: unifying human and machine knowledge 
representations with Object-Process Methodology 

Dov Dori 

May 2004 The VLDB Journal — The International Journal on Very Large Data Bases, 

Volume 13 Issue 2 

Full text available: pdf(1.22 MB) Additional Information: full citation, abstract, index terms

The Visual Semantic Web (ViSWeb) is a new paradigm for enhancing the current Semantic 
Web technology. Based on Object-Process Methodology (OPM), which enables modeling of
systems in a single graphic and textual model, ViSWeb provides for representation of
knowledge over the Web in a unified way that caters to human perceptions while also being
machine processable. The advantages of the ViSWeb approach include equivalent graphic- 
text knowledge representation, visual navigability, semantic sentenc ... 

Keywords: Conceptual graphs, Knowledge representation, Object-Process Methodology, 
Semantic Web, Visual Semantic Web 



5 Concepts of the text editor Lara 
J. Gutknecht 

September 1985 Communications of the ACM, volume 28 issue 9 

Full text available: pdf(1.60 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Lara, a text editor developed for the Lilith workstation, exemplifies the principles underlying 
modern text-editor design: a high degree of interactivity, an internal data structure that 
mirrors currently displayed text, and extensive use of bitmap controlled displays and 
facilities. 

6 Document formatting: Creating reusable well-structured PDF as a sequence of 
component object graphic (COG) elements 

Steven R. Bagley, David F. Brailsford, Matthew R. B. Hardy 

November 2003 Proceedings of the 2003 ACM symposium on Document engineering 

Full text available: pdf(458.01 KB) Additional Information: full citation, abstract, references, citings, index terms

Portable Document Format (PDF) is a page-oriented, graphically rich format based on 
PostScript semantics and it is also the format interpreted by the Adobe Acrobat viewers. 
Although each of the pages in a PDF document is an independent graphic object this 
property does not necessarily extend to the components (headings, diagrams, paragraphs 
etc.) within a page. This, in turn, makes the manipulation and extraction of graphic objects 
on a PDF page into a very difficult and uncertain process. The wo ...

Keywords: PDF, form Xobjects, graphic objects, tagged PDF 



Multimedia document presentation, information extraction, and document formation in
MINOS: a model and a system 

S. Christodoulakis, M. Theodoridou, F. Ho, M. Papa, A. Pathria 

December 1986 ACM Transactions on Information Systems (TOIS), Volume 4 issue 4 

Full text available: pdf(3.16 MB) Additional Information: full citation, abstract, references, citings, index terms, review

MINOS is an object-oriented multimedia information system that provides integrated 
facilities for creating and managing complex multimedia objects. In this paper the model for 
multimedia documents supported by MINOS and its implementation is described. Described 
in particular are functions provided in MINOS that exploit the capabilities of a modern 
workstation equipped with image and voice input-output devices to accomplish an active 
multimedia document presentation and browsing within docu ... 

Proximal nodes: a model to query document databases by content and structure
Gonzalo Navarro, Ricardo Baeza-Yates 

October 1997 ACM Transactions on Information Systems (TOIS), volume 15 issue 4

Full text available: pdf(550.43 KB) Additional Information: full citation, abstract, references, citings, index terms, review

A model to query document databases by both their content and structure is presented. The 
goal is to obtain a query language that is expressive in practice while being efficiently 
implementable, features not present at the same time in previous work. The key ideas of 
the model are a set-oriented query language based on operations on nearby structure
elements of one or more hierarchies, together with content and structural indexing and 
bottom-up evaluation. The model is evaluated in regard t ... 

Keywords: expressivity and efficiency of query languages, hierarchical documents, 
structured text, text algebras 



9 Sequential thematic organization of publications: how to achieve coherence in
proposals and reports

J. R. Tracey, D. E. Rugh, W. S. Starkey 

August 1999 ACM SIGDOC Asterisk Journal of Computer Documentation, volume 23 issue 3 
Full text available: pdf(3.80 MB) Additional Information: full citation, index terms



10 Automatically generated hypertext versions of scholarly articles and their evaluation
James Blustein 

May 2000 Proceedings of the eleventh ACM on Hypertext and hypermedia 

Full text available: pdf(574.75 KB) Additional Information: full citation, references, citings, index terms



Keywords: World Wide Web, automated linking, browsing, digital library, electronic 
journal, evaluation, hypertext, information retrieval, usability 



11 Pen computing: a technology overview and a vision
Andre Meyer 

July 1995 ACM SIGCHI Bulletin, Volume 27 Issue 3 

Full text available: pdf(5.14 MB) Additional Information: full citation, abstract, citings, index terms

This work gives an overview of a new technology that is attracting growing interest in public 
as well as in the computer industry itself. The visible difference from other technologies is in 
the use of a pen or pencil as the primary means of interaction between a user and a 
machine, picking up the familiar pen and paper interface metaphor. From this follows a set 
of consequences that will be analyzed and put into context with other emerging 
technologies and visions. Starting with a short historic ... 

12 Mobile data management: Mimic: raw activity shipping for file synchronization in mobile
file systems 

Tae-Young Chang, Aravind Velayutham, Raghupathy Sivakumar 
June 2004 Proceedings of the 2nd international conference on Mobile systems, 
applications, and services 

Full text available: pdf(334.54 KB) Additional Information: full citation, abstract, references, index terms

In this paper, we consider the problem of file synchronization when a mobile host shares 
files with a backbone file server in a network file system. Several diff schemes have been 
proposed to improve upon the transfer overheads of conventional file synchronization 
approaches which use full file transfer. These schemes compute the binary diff of the new 
file with respect to the old copy at the server and transfer the computed diff to the server 
for file-synchronization. Howev ... 

Keywords: file synchronization, mobile file system, raw activity shipping 



13 Document creation I: Creating structured PDF files using XML templates
Matthew R. B. Hardy, David F. Brailsford, Peter L. Thomas 

October 2004 Proceedings of the 2004 ACM symposium on Document engineering 
Full text available: pdf(166.87 KB) Additional Information: full citation, abstract, references, index terms

This paper describes a tool for recombining the logical structure from an XML document 
with the typeset appearance of the corresponding PDF document. The tool uses the XML 
representation as a template for the insertion of the logical structure into the existing PDF 
document thereby creating a Structured/Tagged PDF. The addition of logical structure adds 
value to the PDF in three ways: the accessibility is improved (PDF screen readers for 
visually impaired users perform better) media options a ... 

Keywords: PDF, XML, logical structure insertion 



14 Interactive Editing Systems: Part II
Norman Meyrowitz, Andries van Dam 

September 1982 ACM Computing Surveys (CSUR), Volume 14 issue 3 

Full text available: pdf(9.17 MB) Additional Information: full citation, references, citings, index terms



15 Customizing information capture and access 
Daniela Rus, Devika Subramanian 

January 1997 ACM Transactions on Information Systems (TOIS), volume 15 issue 1

Full text available: pdf(1.26 MB) Additional Information: full citation, references, index terms, review

This article presents a customizable architecture for software agents that capture and 
access information in large, heterogeneous, distributed electronic repositories. The key idea 
is to exploit underlying structure at various levels of granularity to build high-level indices 
with task-specific interpretations. Information agents construct such indices and are 
configured as a network of reusable modules called structure detectors and segmenters. We 
illustrate our architectu ... 

Keywords: information gathering, software agents, table recognition 

16 Adapting content to mobile devices: Fractal summarization for mobile devices to
access large documents on the web
Christopher C. Yang, Fu Lee Wang 

May 2003 Proceedings of the 12th international conference on World Wide Web 

Full text available: pdf(317.55 KB) Additional Information: full citation, abstract, references, index terms

Wireless access with mobile (or handheld) devices is a promising addition to the WWW and 
traditional electronic business. Mobile devices provide convenience and portable access to 
the huge information space on the Internet without requiring users to be stationary with 
network connection. However, the limited screen size, narrow network bandwidth, small 
memory capacity and low computing power are the shortcomings of handheld devices. 
Loading and visualizing large documents on handheld devices bee ... 

Keywords: document summarization, fractal summarization, handheld devices, mobile 
commerce 
conference on Management of data, volume 24 issue 2
Full text available: pdf(1.51 MB)

In a digital library system, documents are available in digital form and therefore are more 
easily copied and their copyrights are more easily violated. This is a very serious problem, 
as it discourages owners of valuable information from sharing it with authorized users. 
There are two main philosophies for addressing this problem: prevention and detection. The 
former actually makes unauthorized use of documents difficult or impossible while the latter 
makes it easier to discover such activity. I ...

19 Converting a textbook to hypertext 
Roy Rada 

July 1992 ACM Transactions on Information Systems (TOIS), volume 10 issue 3 

Full text available: pdf(1.46 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Traditional documents may be transformed into hypertext by first reflecting the document's 
logical markup in the hypertext (producing first-order hypertext) and then by adding links 
not evident in the document markup (producing second-order hypertext). In our 
transformation of a textbook to hypertext, the textbook is placed in an intermediate form 
based on a semantic net and is then placed into the four hypertext systems: Emacs-Info, 
Guide, HyperTies, and Super-Book. The first-order Guide a ... 

Keywords: document markup, electronic publishing, human-computer interaction, 
hypermedia models 

20 Special issue on natural language generation: Generating natural language summaries 
from multiple on-line sources 

Dragomir R. Radev, Kathleen R. McKeown 

September 1998 Computational Linguistics, volume 24 issue 3 

Full text available: pdf(2.36 MB) Additional Information: full citation, abstract, references, citings

Publisher Site 

We present a methodology for summarization of news about current events in the form of 
briefings that include appropriate background (historical) information. The system that we 
developed, SUMMONS, uses the output of systems developed for the DARPA Message 
Understanding Conferences to generate summaries of multiple documents on the same or 
related events, presenting similarities and differences, contradictions, and generalizations 
among sources of information. We describe the various components ... 



Structure and transformation of documents: Mapping and displaying structural
transformations between XML and PDF 
Matthew R. B. Hardy, David F. Brailsford 

November 2002 Proceedings of the 2002 ACM symposium on Document engineering 

Full text available: pdf(439.03 KB) Additional Information: full citation, abstract, references, citings, index terms

Documents are often marked up in XML-based tagsets to delineate major structural 
components such as headings, paragraphs, figure captions and so on, without much regard 
to their eventual displayed appearance. And yet these same abstract documents, after 
many transformations and 'typesetting' processes, often emerge in the popular format of 
Adobe PDF, either for dissemination or archiving. Until recently PDF has been a totally 
display-based document representation, relying on the underlying PostSc ... 

Keywords: PDF, XML, document structure transformation 



6 Fast detection of communication patterns in distributed executions
Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced Studies 
on Collaborative research 

Full text available: pdf(4.21 MB) Additional Information: full citation, abstract, references, index terms

Understanding distributed applications is a tedious and difficult task. Visualizations based on 
process-time diagrams are often used to obtain a better understanding of the execution of 
the application. The visualization tool we use is Poet, an event tracer developed at the 
University of Waterloo. However, these diagrams are often very complex and do not provide 
the user with the desired overview of the application. In our experience, such tools display 
repeated occurrences of non-trivial commun ... 

7 Level II technical support in a distributed computing environment
Tim Leehane 

September 1996 Proceedings of the 24th annual ACM SIGUCCS conference on User 
services 

Full text available: pdf(5.73 MB) Additional Information: full citation, references, index terms



8 Proceedings of the SIGNUM conference on the programming environment for
development of numerical software

March 1979 ACM SIGNUM Newsletter, volume 14 issue 1

Full text available: pdf(5.02 MB) Additional Information: full citation



9 An experimental multimedia mail system 

Jonathan B. Postel, Gregory G. Finn, Alan R. Katz, Joyce K. Reynolds 

January 1988 ACM Transactions on Information Systems (TOIS), volume 6 issue 1

Full text available: pdf(1.50 MB) Additional Information: full citation, abstract, references, index terms, review

A computer-based experimental multimedia mail system that allows the user to read, 
create, edit, send, and receive messages containing text, images, and voice is discussed. 
10 Special issue on knowledge representation
Ronald J. Brachman, Brian C. Smith 
February 1980 ACM SIGART Bulletin, issue 70 

Full text available: pdf(13.13 MB) Additional Information: full citation, abstract

In the fall of 1978 we decided to produce a special issue of the SIGART Newsletter devoted 
to a survey of current knowledge representation research. We felt that there were two
useful functions such an issue could serve. First, we hoped to elicit a clear picture of how 
people working in this subdiscipline understand knowledge representation research, to 
illuminate the issues on which current research is focused, and to catalogue what 
approaches and techniques are currently being developed. Secon ... 

11 A structural view of the Cedar programming environment
Daniel C. Swinehart, Polle T. Zellweger, Richard J. Beach, Robert B. Hagmann 

August 1986 ACM Transactions on Programming Languages and Systems (TOPLAS),
Volume 8 Issue 4

Full text available: pdf(6.32 MB) Additional Information: full citation, abstract, references, citings, index terms

This paper presents an overview of the Cedar programming environment, focusing on its 
overall structure— that is, the major components of Cedar and the way they are organized. 
Cedar supports the development of programs written in a single programming language, 
also called Cedar. Its primary purpose is to increase the productivity of programmers whose 
activities include experimental programming and the development of prototype software 
systems for a high-performance personal computer. T ... 

12 An advanced full-text retrieval and analysis system 
J. Smith, S. Weiss, G. Ferguson 

November 1987 Proceedings of the 10th annual international ACM SIGIR conference on
Research and development in information retrieval

Full text available: pdf(900.69 KB) Additional Information: full citation, abstract, references, citings, index terms

MICROARRAS is an advanced full-text retrieval and analysis system. It supports fast, 
efficient browsing of a document's vocabulary as well as its text, recursive analytic 
categories, Boolean search with flexible context specifications, evaluation of arithmetic
expressions, and graphical display of various numeric distributions. The system is designed
to work with large textbases stored on remote mainframes or on a local store for a
microcomputer or workstation. The description covers syste ...

13 The transport layer: tutorial and survey 
Sami Iren, Paul D. Amer, Phillip T. Conrad 

December 1999 ACM Computing Surveys (CSUR), volume 31 issue 4 

Full text available: pdf(261.78 KB) Additional Information: full citation, abstract, references, citings, index terms

Transport layer protocols provide for end-to-end communication between two or more 
hosts. This paper presents a tutorial on transport layer concepts and terminology, and a 
survey of transport layer services and protocols. The transport layer protocol TCP is used as 
a reference point, and compared and contrasted with nineteen other protocols designed 
over the past two decades. The service and protocol features of twelve of the most 
important protocols are summarized in both text and tables. ...

Keywords: TCP/IP networks, congestion control, flow control, transport protocol, transport 
service 
This paper is a survey of current methods for the on-line creation and editing of computer
programs and of ordinary manuscripts text. The characteristics of on-line editing systems 
are examined and examples of various implementations are described in three categories: 
program editors, text editors, and terminals with local editing facilities. 

15 The Satchel system architecture: mobile access to documents and services 
Mike Flynn, David Pendlebury, Chris Jones, Marge Eldridge, Mik Lamming 
December 2000 Mobile Networks and Applications, volume 5 issue 4 
Mobile professionals require access to documents and document-related services,
such as printing, wherever they may be. They may also wish to give documents to
colleagues electronically, as easily as with paper, face-to-face, and with similar
security characteristics. The Satchel system provides such capabilities in the form of a 
mobile browser, implemented on a device that professional people would be likely to carry 
anyway, such as a pager or mobile phone. Printing may be per ... 
Peter Kirstein, Goli Montasser-Kohsari

June 1996 Communications of the ACM, volume 39 issue 6 

Full text available: pdf(1.24 MB) Additional Information: full citation, references, index terms, review



17 Exchanging APL workspaces (tutorial session)
Harry Bertuccelli 

August 1989 Proceedings of the ACM/SIGAPL conference on APL as a tool of thought 
(session tutorials) 

Full text available: pdf(1.31 MB) Additional Information: full citation, index terms



18 Communication complexity of document exchange 